Section: New Results

Computational Vision & Perception

Participants : Matthew Blaschko, Iasonas Kokkinos, Pawan Kumar, Nikos Paragios.

  • Structured Output Ranking & Detailed Understanding of Objects in Computer Vision [Matthew Blaschko]

    In [23] we proposed a novel method for efficiently optimizing an objective that ranks structured outputs by their loss. Based on the observation that structured output spaces [9] in computer vision problems can be well modeled by a small number of loss values, our algorithm is able to optimize a quadratic number of pairwise constraints in linear time. In [38] we detail the research activities of a summer workshop hosted by Johns Hopkins University on learning a detailed understanding of objects and scenes in natural images. We worked on automatic verification of annotations provided through Amazon Mechanical Turk [35], texture categorization, and dependence modeling for bottom-up proposals.
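To illustrate why a small number of distinct loss values helps, here is a minimal sketch that counts misranked pairs without enumerating all of them. It is not the algorithm of [23]. The function names and the toy data are assumptions made for illustration, and the sketch runs in O(n log n + n k) time for n outputs and k distinct loss values rather than in linear time.

```python
from collections import Counter
from itertools import groupby


def count_misranked_pairs(scores, losses):
    """Count pairs (i, j) with losses[i] < losses[j] but scores[i] < scores[j].

    A lower-loss output should receive a higher score, so each such pair
    violates the desired ranking. Hypothetical helper, for illustration only.
    """
    # Visit outputs from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    seen = Counter()  # loss value -> number of strictly higher-scored outputs
    violations = 0
    # Handle equal scores as one group so ties are not counted as violations.
    for _, group in groupby(order, key=lambda i: scores[i]):
        group = list(group)
        for i in group:
            # Only k distinct loss values exist, so this inner sum costs O(k).
            violations += sum(c for loss, c in seen.items() if loss > losses[i])
        for i in group:
            seen[losses[i]] += 1
    return violations


def count_misranked_pairs_brute(scores, losses):
    """Quadratic reference implementation used to check the sketch."""
    n = len(scores)
    return sum(
        1
        for i in range(n)
        for j in range(n)
        if losses[i] < losses[j] and scores[i] < scores[j]
    )


# Toy data (assumed): five candidate outputs with three distinct loss values.
scores = [0.9, 0.2, 0.5, 0.5, 0.1]
losses = [0, 1, 0, 2, 1]
print(count_misranked_pairs(scores, losses))
```

The per-loss-value counter is what replaces the explicit pairwise comparison: each output is compared against at most k counts instead of against every other output.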

  • Efficient inference and learning for structured probabilistic models of deformable objects [Iasonas Kokkinos, Haithem Boussaid & Stavros Tsogkas]

    We have developed novel features to describe surface points intrinsically through the Intrinsic Shape Context (ISC) descriptor published in [17]. This method has delivered state-of-the-art results in surface point matching, and we will explore its use for surface correspondence. The implementation of these descriptors is publicly available. In [32] we proposed a learning-based approach to symmetry detection that fuses multiple cues related to image intensity, color and texture, and which delivered state-of-the-art results. We intend to extend this approach to 3D image analysis, in particular for medical images. The implementation of these detectors is publicly available. In [27] we introduced a grouping-based method to learn and detect action classes in spatio-temporal data. Our method can both classify actions and indicate the spatio-temporal structures that support the decision. The implementation of our front-end is publicly available. In [40] we extended our work on efficient algorithms for object detection to accommodate fast methods for computing the part scores in a principled optimization framework; we present this work thoroughly in [40] and have made the implementation publicly available.

  • Multi-view Image Segmentation & Parsing [Nikos Paragios]

    In [28] a method for image matching was proposed that exploits hierarchical image representations through higher-order graphs. The matching was achieved through a graph-based theoretical framework in which the similarity and spatial consistency of the semantic objects in the image are encoded in a graph of commute times that is also endowed with singleton terms through shape descriptors. Many-to-many matching of regions is especially challenging because the segmentation is unstable under slight image changes, and we handle it explicitly through higher-order potentials. These ideas were further explored in the context of co-segmentation [29], where a method to determine a consistent partition of multiple images was introduced through a multi-scale, multiple-image generative model based on region matching that exploits inter-image information and establishes correspondences between the common objects that appear in the scene. Finally, in [24] a method that combines bottom-up information (visual information, visual descriptors, element detection) with top-down models (hierarchical shape grammars) was considered for automatic facade parsing through reinforcement learning, while in [30] a method for 3D image parsing was proposed based on a hierarchical grammar that performs explicit 3D modeling of the scene through a combination of multi-image segmentation and a depth reconstruction process. The problem of optimally combining these two concurrent terms was addressed through a Pareto-driven criterion, while the optimization was carried out with an evolutionary computation algorithm.
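As an illustration of what a Pareto-driven criterion over two concurrent terms can look like, the following minimal sketch keeps only the candidates that no other candidate improves on in both objectives at once. It is not the method of [30]. The function name, the pairing of a segmentation cost with a depth-reconstruction cost, and the numeric values are all assumptions, and both objectives are assumed to be minimized.

```python
def pareto_front(candidates):
    """Return the non-dominated (first_cost, second_cost) pairs.

    Both costs are assumed to be minimized. A candidate is dominated when
    another candidate is no worse in both costs and strictly better in one.
    Hypothetical helper, for illustration only.
    """
    front = []
    best_second = float("inf")
    # Sort by first cost (ties broken by second cost); duplicates are dropped.
    for first, second in sorted(set(candidates)):
        # Every earlier point has first cost <= this one, so this point
        # survives only if it strictly improves the second cost.
        if second < best_second:
            front.append((first, second))
            best_second = second
    return front


# Toy (segmentation_cost, depth_cost) pairs; the values are made up.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.0, 6.0)]
print(pareto_front(candidates))
```

An evolutionary algorithm would typically use such a non-dominated set to decide which candidate solutions to keep from one generation to the next, instead of collapsing the two terms into a single weighted sum.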